Working With Vectors(Part 1)

data science data types programming R

Part one of working with vectors in R.

Danielle Brantley https://gist.github.com/danielle-b
01-13-2020

For this post, I’ll discuss the working with vectors lesson on DataQuest’s Data Analyst in R track. I’m going to dive deeper into vectors, talking about indexing in R and R’s different data types. There was so much that I learned in this lesson that I decided to break this topic up into two posts. So let’s jump right in!

As mentioned in my last post, vectors are storage objects that store a sequence of values. Vectors can be indexed to select a subset of the elements they contain. Each element of a vector is assigned a position. This reminds me of lists in Python, with one key difference: R is a 1-indexed programming language, so the first element in a vector is assigned position one. This is very different from what I learned in Python or JavaScript, where indexing starts at zero. I think it’s a bit easier though, because when we count we usually start from one.
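Here’s a quick sketch of what 1-based indexing looks like in practice. The vector `letters_vec` is a made-up example of mine, not one from the lesson.

```r
# Hypothetical example vector (not from the lesson)
letters_vec <- c("a", "b", "c")

# The first element lives at position 1, not 0
first <- letters_vec[1]
print(first)

# Position 0 does not raise an error; it returns an empty vector
zero <- letters_vec[0]
print(length(zero))
```

If you’re coming from Python or JavaScript, the second part is the one to watch: asking for position 0 quietly returns nothing instead of the first element.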

Indexing By Position

You can index vectors by position using a single element, a range of elements or multiple elements. Let’s take the example vector from my previous Intro To R Programming post, tea_prices, which holds the values 5, 4, 2, 2 and 3.

Index Using a Single Element:

tea_prices[2]
[1] 4

Index by Range of Elements:

tea_prices[2:4]
[1] 4 2 2

Index Using Multiple Elements:

tea_prices[c(1, 3, 5)]
[1] 5 2 3

R’s Data Types

Like other programming languages, R has data types. In this lesson, DataQuest presents three basic data types:

Numeric (Double): consists of numbers (4, 32.6, 32.67521, -2, -7.9)

Character: consists of letters, numbers and special characters. Character data is surrounded by quotation marks ("snake", "&", "%", "special0964", "cool + beans")

Logical: stores the Boolean values TRUE and FALSE

To find out which data type I’m working with, I learned to use the typeof() function.
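As a minimal sketch, here is typeof() applied to a character value and a logical value taken from the list above. The variable names `chr_type` and `lgl_type` are just my own labels for illustration.

```r
# typeof() on a character value (note the quotation marks)
chr_type <- typeof("snake")
print(chr_type)

# typeof() on a logical value
lgl_type <- typeof(TRUE)
print(lgl_type)

# Numbers in quotation marks are still character data
quoted_number_type <- typeof("0964")
print(quoted_number_type)
```

These two types come back with the names you would expect from the list: "character" and "logical".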

This is where things got a bit confusing for me. I decided to try this for myself in both RStudio and the R console that comes with R. I ended up with the following:

typeof(88)
[1] "double"
typeof(6.7)
[1] "double"
typeof(-0.75)
[1] "double"
typeof(-4.93)
[1] "double"
typeof(47)
[1] "double"
typeof(392)
[1] "double"
typeof(9.6666667)
[1] "double"

As you can see, when I use the typeof() function on both whole numbers and decimals, it doesn’t come back as numeric; it comes back as double. From the lesson, I had understood double to mean just decimal numbers.

I decided to consult the #rstats community on Twitter.

Posing a question to #rstats twitter

A few members of the #rstats community, Maarten Demeyer, Tyson Barrett, and Colin Fay came through with helpful responses that helped clarify the confusion I had. A special thank you to them! Check out their responses here.

What I gathered from these responses and further research is that numeric is a class and the double data type falls into that class. That is why typeof() reports "double" even for whole numbers like 88. There are also more data types in R, but I won’t get into those right now.
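To see the difference between the class and the underlying type, here’s a small sketch comparing class() and typeof() on the same number. One assumption that goes beyond the lesson: I’m using the `L` suffix (as in `88L`), which is how R lets you write an integer literal, to show that another type also counts as numeric.

```r
# class() reports "numeric" for a plain number, while typeof() reports
# the underlying storage type, "double"
class_of_88 <- class(88)
type_of_88 <- typeof(88)

# Assumption beyond the lesson: the L suffix creates an integer instead
type_of_88L <- typeof(88L)

# Both doubles and integers count as numeric
both_numeric <- is.numeric(88) && is.numeric(88L)

print(c(class_of_88, type_of_88, type_of_88L))
print(both_numeric)
```

So a whole number typed without the `L` suffix is stored as a double, which matches the results I got above.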

names() Function

I also learned that in R you can assign names to vector elements. To do this, we use the names() function. Again, taking the example from my Intro to R Programming post, if I wanted to use the values contained in tea_flavors as names for the values stored in tea_prices, it would look like this:

tea_flavors <- c("chai", "matcha", "black", "green", "white")
tea_prices <- c(5, 4, 2, 2, 3)
names(tea_prices) <- tea_flavors
tea_prices
  chai matcha  black  green  white 
     5      4      2      2      3 

I could also use the names() function to return the names of the elements in a vector like this:

names(tea_prices)
[1] "chai"   "matcha" "black"  "green"  "white"
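As an aside, R also lets you name the elements while you create the vector. This wasn’t part of the lesson, so treat it as a sketch of an alternative; `tea_prices_inline` is a name I made up for the comparison.

```r
# Approach from the post: assign names after creating the vector
tea_flavors <- c("chai", "matcha", "black", "green", "white")
tea_prices <- c(5, 4, 2, 2, 3)
names(tea_prices) <- tea_flavors

# Alternative (not from the lesson): name the elements inside c()
tea_prices_inline <- c(chai = 5, matcha = 4, black = 2, green = 2, white = 3)

# Check whether both vectors carry the same values and the same names
same <- identical(tea_prices, tea_prices_inline)
print(same)
```

The names() approach is handy when the names already live in another vector, like tea_flavors does here.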

Indexing By Name

Lastly for part one, I’ll talk about indexing by name. In R, you can index vectors by name as well as by position. Note that I get the same result whether I index by name or by position.

tea_prices["matcha"]
matcha 
     4 
tea_prices[2]
matcha 
     4 

Earlier in this post you saw how I indexed by position using multiple elements. You can do the same when indexing by name.

tea_prices[c("black", "white")]
black white 
    2     3 

Whew, that was a lot to cover! That’s it for Part One of working with vectors. Part Two is coming soon!

Citation

For attribution, please cite this work as

Brantley (2020, Jan. 13). Data Sci Dani: Working With Vectors(Part 1). Retrieved from https://datascidani.com/posts/working_with_vectors_one 01-13-20/

BibTeX citation

@misc{brantley2020working,
  author = {Brantley, Danielle},
  title = {Data Sci Dani: Working With Vectors(Part 1)},
  url = {https://datascidani.com/posts/working_with_vectors_one 01-13-20/},
  year = {2020}
}